Multiple-pronunciation lexical modeling in a speaker independent speech understanding system

نویسندگان

  • Chuck Wooters
  • Andreas Stolcke
چکیده

One of the sources of difficulty in speech recognition and understanding is the variability due to alternate pronunciations of words. To address the issue we have investigated the use of multiple-pronunciation models (MPMs) in the decoding stage of a speaker-independent speech understanding system. In this paper we address three important issues regarding MPMs: (a) Model construction: How can MPMs be built from available data without human intervention? (b) Model embedding: How should MPM construction interact with the training of the sub-word unit models on which they are based? (c) Utility: Do they help in speech recognition? Automatic, data-driven MPM construction is accomplished using a structural HMM induction algorithm. The resulting MPMs are trained jointlywith a multi-layer perceptron functioningas a phonetic likelihood estimator. The experiments reported here demonstrate that MPMs can significantly improve speech recognition results over standard single pronunciation models.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multiple-Pronunciation Lexical Modeling Based on Phoneme Confusion Matrix for Dysarthric Speech Recognition

In this paper, we propose speaker-dependent multiple-pronunciation lexical modeling for improving the performance of dysarthric automatic speech recognition (ASR). For each dysarthric speaker, a phoneme confusion matrix is first constructed from the results of phoneme recognition. Then, pronunciation variation rules are extracted by investigating the phoneme confusion matrix, and they are incor...

متن کامل

Lexical Modeling in a Speaker Independent Speech Understanding System

Over the past 40 years, significant progress has been made in the fields of speech recognition and speech understanding. Current state-of-the-art speech recognition systems are capable of achieving word-level accuracies of 90% to 95% on continuous speech recognition tasks using 5000 words. Even larger systems, capable of recognizing 20,000 words are just now being developed. Speech understandin...

متن کامل

Advantages of Using Computer in Teaching English Pronunciation

Pronunciation continues to grow in importance because of its key roles in speech recognition, speech perception, and speaker identity. Computer is being increasingly used in teaching English pronunciation to enhance its quality. The purpose of this paper is to discuss the advantages of using computer in English pronunciation instruction. Understanding the advantages of computer is an important ...

متن کامل

Large vocabulary taiwanese (min-nan) speech recognition using tone features and statistical pronunciation modeling

A large vocabulary Taiwanese (Min-nan) speech recognition system is described in this paper. Due to the severe multiple pronunciation phenomenon in Taiwanese partly caused by tone sandhi, a statistical pronunciation modeling technique based on tonal features is used. This system is speaker independent. It was trained by a bi-lingual Mandarin/Taiwanese speech corpus to alleviate the lack of pure...

متن کامل

Pronunciation lexicon adaptation for TTS voice building

This paper describes reducing phone label errors in TTS voice building by means of modeling of speaker pronunciation variants. Each speaker has his or her own unique pronunciations (and context-dependent variations), so that no one standard lexicon is able to cover all of the speaker’s variations. Creating speaker-dependent pronunciation lexicons for automatic speech labeling of our TTS voice d...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1994